Spanish Morphological Generation with Wide-Coverage Lexicons and Decision Trees
نویسندگان
چکیده
Morphological Generation is the task of producing the appropiate inflected form of a lemma in a given textual context and according to some morphological features. This paper describes and evaluates wide-coverage morphological lexicons and a Decision Tree algorithm that perform Morphological Generation in Spanish at state-of-the art level. The Freeling, Leffe and Apertium Spanish lexicons, the J48 Decision Tree algorithm and the combination of J48 with Freeling and Leffe lexicons have been evaluated with the following datasets for Spanish: i) CoNLL2009 Shared Task dataset, ii) Durrett and DeNero dataset of Spanish Verbs (DDN), and iii) SIGMORPHON 2016 Shared Task (task-1) dataset. The results show that: i) the Freeling and Leffe lexicons achieve high coverage and precision over the DDN and SIGMORPHON 2016 datasets, ii) the J48 algorithm achieves state-of-the-art results in all of the three datasets, and iii) the combination of Freeling, Leffe and the J48 algorithm outperformed the results of our other approaches in the three evaluation datasets, improved slightly the results of the CoNLL2009 and SIGMORPHON 2016 reported in the state-of-the-art literature, and achieved results comparable to the ones reported in the state-of-the-art literature on the DDN dataset evaluation.
منابع مشابه
A Morphological and Syntactic Wide-coverage Lexicon for Spanish: The Leffe
In this paper, we introduce the Léxico de Formas Flexionadas del Español (Leffe), a wide-coverage morphological and syntactic Spanish lexicon based on the Alexina lexical framework. We explain how the Leffe has been created by merging together several heterogeneous lexicons and how the Alexina lexical framework has been applied to Spanish. We also introduce a semi-automatic technique based on a...
متن کاملMorpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning
Morpho-syntactic lexicons provide information about the morphological and syntactic roles of words in a language. Such lexicons are not available for all languages and even when available, their coverage can be limited. We present a graph-based semi-supervised learning method that uses the morphological, syntactic and semantic relations between words to automatically construct wide coverage lex...
متن کاملBuilding a morphological and syntactic lexicon by merging various linguistic resources
This paper shows how large-coverage morphological and syntactic NLP lexicons can be developed by interpreting, converting to a common format and merging existing lexical resources. Applied on Spanish, this allowed us to build a morphological and syntactic lexicon, the Leffe. It relies on the Alexina framework, originally developed together with the French lexicon Lefff. We describe how the inpu...
متن کاملMorphology to the Rescue Redux: Resolving Borrowings and Code-Mixing in Machine Translation
In the IBM LMT machine translation system, derivational morphological rules recognize and analyze words that are not found in its source lexicons, and generate default transfers for these unlisted words. Unfound words with no inflectional or derivational affixes are by default nouns. These rules are now expanded to provide lexical coverage of a particular set of words created on the fly in emai...
متن کاملConstructing and Using Broad-coverage Lexical Resource for Enhancing Morphological Analysis of Arabic
Broad-coverage language resources which provide prior linguistic knowledge must improve the accuracy and the performance of NLP applications. We are constructing a broad-coverage lexical resource to improve the accuracy of morphological analyzers and part-of-speech taggers of Arabic text. Over the past 1200 years, many different kinds of Arabic language lexicons were constructed; these lexicons...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Procesamiento del Lenguaje Natural
دوره 58 شماره
صفحات -
تاریخ انتشار 2017